Abstract

A large number of signatures and distance measures have been
proposed in the CBIR field. None of them alone produces
results of the quality desired by the user. Combining
features appears to be a good way to improve the quality of
the results. In this paper, we address feature combination
in three ways: the re-ranking mechanism, generally employed
as a relevance feedback tool; the utility concept, which
relies on the rank of the images; and a mathematical formula
that combines features. The features under experimentation
are a histogram with 27 fixed bins and the three color
moments: mean, variance and skewness. The paper answers
three questions. First, does combining features, which adds
complexity, actually improve the results, and if so, by how
much? Second, which of the three approaches is the best?
Finally, which configuration yields the best improvement?
Experiments on the Wang database show that combining
signatures improves the results whichever approach is used,
and that the best of the tested approaches is the third one,
which employs a single formula, specifically with the
Mean-Intersection setting.
Introduction

A content-based image retrieval (CBIR) system aims at
selecting, from a repository, a subset of images that
satisfy the visual need of the user, generally expressed as
a query image. Contrary to text-based image retrieval
(TBIR), which relies on annotations, a CBIR system employs
low-level visual features extracted from the images
themselves (Rui et al, 1999; Zheng et al, 2010; Smeulders
et al, 2000), such as color, texture and shape. Each of
these descriptors has many signatures. Unfortunately, none
of the proposed signatures returns the results desired by
users. One way of improving the quality of a CBIR system is
to combine signatures. Many works in the literature fall
under the purview of fusion. We can classify them into two
classes. The first category is the fusion of textual and
visual features, as in (Zhiong et al, 2011), in which the
authors compared textual and visual features from the
performance viewpoint and concluded that combining the two
information sources can consistently enhance the final
accuracy.
The second category is the fusion of visual features, as in
(Noureddine Abbadeni, 2009), in which the author proposed an
approach based on multiple representations, multiple
queries, and the fusion of the results returned by these
different representations and queries; (Jian et al, 2010),
in which the authors assign a weight to each feature based
on relevance feedback; and (James et al, 2003), in which the
authors merged the first k results obtained from four
channels, each channel using some CBIR features.
There is also a third category, which combines visual
features with audio, as in (Manohar et al, 2011). More
details on the fusion of multimodal information can be found
in (Guan et al, 2010). In this paper we address the second
category. The question is at which step the signatures
should be combined: during the indexing process, during the
matching stage, or only on the results. We experiment with
three ways of fusion: re-ranking, considered as fusion in a
hierarchical manner; the utility concept, which combines
only the results; and an unweighted formula that fuses
signatures during the matching process. Combining features
during the indexing process, as in (Anne et al, 2001), was
not performed in our work owing to the curse of
dimensionality that it might cause.

The rest of this paper is arranged as follows. Section 2
covers the first fusion approach considered here, namely
re-ranking. The second approach, based on the utility
concept, is discussed in Section 3. Section 4 is devoted to
the third approach, which combines features using a single
formula. The experiments conducted and the tools used are
presented in Section 5, together with a discussion of the
results. We end the paper with a conclusion that draws some
perspectives of this study.

Re-ranking Mechanism via the Relevance Feedback

Re-ranking consists of re-ordering the first images
returned by the system using the user's judgment. This
mechanism is known as a relevance feedback tool. Relevance
feedback has been shown to be a very effective tool for
enhancing results in text retrieval. In content-based image
retrieval, it is increasingly used and very good results
have been obtained. In our work, the re-ranking process,
which aims to refine the results, is done automatically by
applying a signature or a similarity measure different from
the one employed during the first ranking. This method was
used in (Jaekyong et al, 2009), where the authors used
global and local features in a hierarchical manner. They
first applied the Fuzzy C-Means clustering method using
global features as the indexing signature, and then refined
the results obtained using local features.
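
The automatic re-ranking described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `rerank`, the cut-off `k` standing for "the first images", and the toy distance `l1` are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def rerank(query_a: Vector, query_b: Vector,
           db_a: Dict[str, Vector], db_b: Dict[str, Vector],
           dist_a: Distance, dist_b: Distance, k: int) -> List[str]:
    """Rank with signature A, then re-order the first k images with signature B.

    k (how many top images get re-ordered) is a hypothetical parameter.
    """
    # First ranking: sort every image by its distance under signature A.
    first = sorted(db_a, key=lambda name: dist_a(query_a, db_a[name]))
    head, tail = first[:k], first[k:]
    # Second ranking: re-order only the head using signature B.
    head = sorted(head, key=lambda name: dist_b(query_b, db_b[name]))
    return head + tail


def l1(u: Vector, v: Vector) -> float:
    """Toy distance used only for this illustration."""
    return sum(abs(x - y) for x, y in zip(u, v))


# Toy data: signature A is 1-D, signature B is 2-D.
db_a = {"img1": [0.1], "img2": [0.2], "img3": [0.9]}
db_b = {"img1": [0.8, 0.8], "img2": [0.1, 0.1], "img3": [0.0, 0.0]}
ranking = rerank([0.0], [0.0, 0.0], db_a, db_b, l1, l1, k=2)
print(ranking)
```

Only the first `k` images are re-ordered, so an image ranked low by the first signature (here `img3`) cannot be promoted by the second one.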

Utility Concept 

The second method considered is inspired by the utility
concept (Fishburn, 1998), which consists in assigning
scores to the returned images that decrease with their
rank, so that the images ranked first receive the highest
values. The value assigned to each image is given by the
following formula:

$$V = \frac{1}{N}(N - R) \qquad (1)$$

where N is the number of returned images and R is the rank
of the image. The value V therefore lies in the range 0 to
1. Combining features then requires computing, for each
image, its total V by summing its values over all the
signatures considered. The images are then ranked by this
new total value and presented to the user.
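
The rank-based fusion described above can be sketched as follows. The sketch assumes the score takes the form (N - R) / N, which is consistent with the stated range of 0 to 1, and uses 1-based ranks. The function names and the handling of images missing from one list are assumptions of this illustration.

```python
from typing import Dict, List


def utility_scores(ranking: List[str]) -> Dict[str, float]:
    """Score each returned image as (N - R) / N, with R its 1-based rank."""
    n = len(ranking)
    return {name: (n - rank) / n for rank, name in enumerate(ranking, start=1)}


def fuse_by_utility(rankings: List[List[str]]) -> List[str]:
    """Sum the per-signature scores and rank images by the total."""
    totals: Dict[str, float] = {}
    for ranking in rankings:
        for name, value in utility_scores(ranking).items():
            # An image missing from one list simply gets no contribution
            # from that signature (an assumption of this sketch).
            totals[name] = totals.get(name, 0.0) + value
    # Highest total first; ties are broken by name to keep the output stable.
    return sorted(totals, key=lambda name: (-totals[name], name))


by_mean = ["a", "b", "c", "d"]          # ranking returned by one signature
by_intersection = ["b", "d", "a", "c"]  # ranking returned by another
fused = fuse_by_utility([by_mean, by_intersection])
print(fused)
```

Because only ranks are used, this fusion needs no normalization of the underlying distances or similarities.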

Combining Signatures in the Same Equation 

This third approach consists of fusing signatures while
computing similarity. We assign the same weight to all the
signatures considered. The distance or similarity value of
each signature is normalized before fusion. A distance is
normalized by dividing it by the greatest distance
obtained. A similarity, which here means the histogram
intersection, is mapped to a normalized distance by the
following formula:

$$ND = 1 - \frac{S}{GS} \qquad (2)$$

where ND is the normalized distance obtained, S is the
similarity value being mapped and GS is the greatest
similarity value obtained.
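
The normalization and fusion steps can be sketched as follows. The text states only that all signatures receive the same weight; taking the fused score as the plain sum of the normalized distances is therefore an assumption of this sketch, as are the function names and the toy values.

```python
from typing import Dict, List


def normalize_distances(d: Dict[str, float]) -> Dict[str, float]:
    """Divide every distance by the greatest distance obtained."""
    greatest = max(d.values())
    if greatest == 0:
        return {name: 0.0 for name in d}
    return {name: value / greatest for name, value in d.items()}


def similarity_to_distance(s: Dict[str, float]) -> Dict[str, float]:
    """Map similarities to normalized distances: ND = 1 - S / GS."""
    greatest = max(s.values())
    if greatest == 0:
        return {name: 1.0 for name in s}
    return {name: 1.0 - value / greatest for name, value in s.items()}


def fuse(normalized: List[Dict[str, float]]) -> List[str]:
    """Rank images by the unweighted sum of their normalized distances."""
    names = normalized[0].keys()
    total = {name: sum(nd[name] for nd in normalized) for name in names}
    # Smallest fused distance first; ties are broken by name.
    return sorted(total, key=lambda name: (total[name], name))


mean_dist = {"a": 2.0, "b": 4.0, "c": 8.0}    # toy Euclidean distances
hist_sim = {"a": 10.0, "b": 40.0, "c": 20.0}  # toy intersection values
fused = fuse([normalize_distances(mean_dist), similarity_to_distance(hist_sim)])
print(fused)
```

Here image `b` is ranked first because its perfect normalized intersection outweighs its middling distance on the mean signature.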


Tools, Experiments and Results

In this section, we first present the different methods and
tools used during the series of experiments conducted.

1) Indexing Signatures

We list here the different signatures considered in our
work.

• Color Histogram 

A large number of indexing methods for CBIR are reported in
the literature, one of which is the color histogram
reported in (Swain and Ballard, 1991). This technique has
been employed in many works and is recognized as one of the
oldest and most basic methods for CBIR. The histogram itself
is a statistical vector whose elements hold the pixel count
for each color in the image.
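
A histogram with 27 fixed bins can be sketched as follows. The text does not say how the 27 bins are laid out; this illustration assumes 3 levels per RGB channel (3 x 3 x 3 = 27), and the function name `histogram_27` is hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def histogram_27(pixels: Sequence[Pixel]) -> List[int]:
    """Count pixels in 27 fixed bins (3 levels per RGB channel; assumed layout)."""
    hist = [0] * 27
    for r, g, b in pixels:
        # Map each 0..255 channel value to a level 0, 1 or 2.
        lr, lg, lb = r * 3 // 256, g * 3 // 256, b * 3 // 256
        hist[lr * 9 + lg * 3 + lb] += 1
    return hist


pixels = [(0, 0, 0), (255, 255, 255), (250, 250, 250), (100, 100, 100)]
hist = histogram_27(pixels)
print(hist[0], hist[13], hist[26])
```

The two near-white pixels fall into the same bin, which is the intended effect of a coarse fixed quantization.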

• Color Moments 

a) The Mean 

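$$\mu_i = \frac{1}{N}\sum_{j=1}^{N} f_{ij} \qquad (3)$$

where $f_{ij}$ is the value of the j-th pixel in color
channel i and N is the number of pixels in the image.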

b) The Variance 

$$\sigma_i = \left(\frac{1}{N}\sum_{j=1}^{N}(f_{ij} - \mu_i)^2\right)^{1/2} \qquad (4)$$

c) The Skewness 

$$s_i = \left(\frac{1}{N}\sum_{j=1}^{N}(f_{ij} - \mu_i)^3\right)^{1/3} \qquad (5)$$
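
The three moments of one color channel can be sketched as follows. The sketch assumes the common color-moment form in which the second and third central moments are reported under a square root and a cube root respectively. If the plain variance is intended, the square root should be dropped. The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def color_moments(channel: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, second moment, third moment) of one color channel.

    Assumption: the second and third central moments are reported as their
    square root and cube root, as in the usual color-moment definition.
    """
    n = len(channel)
    mean = sum(channel) / n
    second = sum((x - mean) ** 2 for x in channel) / n
    third = sum((x - mean) ** 3 for x in channel) / n
    spread = math.sqrt(second)
    # Signed cube root: a negative third moment must stay negative.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, spread, skew


mean, spread, skew = color_moments([0.0, 0.0, 0.0, 4.0])
print(mean, spread, skew)
```

Applying the function to each of the three color channels and concatenating the results gives one feature vector per image.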

2) Matching Measure 

• Euclidean Distance 

$$d(F, F') = \left(\sum_{m=1}^{n}(F_m - F'_m)^2\right)^{1/2} \qquad (6)$$

where F and F' are the vectors being compared and n is
their number of components.

• Histogram Intersection 

$$d_{\cap}(F_q, F_d) = \sum_{j=1}^{n} \min(F_{q,j}, F_{d,j}) \qquad (7)$$

where $F_q$ and $F_d$ are the two histograms to compare
and n is the number of bins.
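
Both matching measures can be sketched in a few lines. The function names are hypothetical, and the inputs are assumed to be vectors of equal length.

```python
import math
from typing import Sequence


def euclidean(f: Sequence[float], g: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def histogram_intersection(fq: Sequence[float], fd: Sequence[float]) -> float:
    """Sum of bin-wise minima; a larger value means more similar histograms."""
    return sum(min(a, b) for a, b in zip(fq, fd))


distance = euclidean([0.0, 0.0], [3.0, 4.0])
overlap = histogram_intersection([2, 0, 5], [1, 3, 5])
print(distance, overlap)
```

Note that the two measures point in opposite directions: a smaller Euclidean distance means a better match, whereas a larger intersection does, which is why a similarity has to be mapped to a distance before the two can be fused.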

3) Evaluation of Methods under experimentation 

To evaluate the performance of the methods, we used the
Precision and Recall measures (Babu et al, 1995). Precision
is defined as the ratio of relevant images retrieved to all
images retrieved, while recall is defined as the ratio of
relevant images retrieved to all relevant images in the
database, that is, the probability that an image will be
retrieved given that it is relevant. Both measures are
given respectively by the following formulas:

$$\text{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of images retrieved}}$$

$$\text{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images in the database}}$$
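
The two measures can be computed for a single query as follows. This is a minimal sketch: the function name, the use of image identifiers, and the convention of returning 0 for an empty set are assumptions of this illustration.

```python
from typing import Iterable, Set, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)  # relevant images retrieved
    # Returning 0.0 for an empty set is a convention chosen for this sketch.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# 4 images retrieved, 5 relevant images in the database, 2 in common.
p, r = precision_recall(["a", "b", "c", "d"], {"a", "c", "e", "f", "g"})
print(p, r)
```

Evaluating this at increasing numbers of retrieved images gives the points of a Precision-Recall curve.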


Results and Discussion 

Before showing the results of the three fusion approaches
being compared, we first present the results obtained by
the primitive signatures without fusion.

TABLE 1 PRECISION VS. RECALL OVER THE PRIMITIVE SIGNATURES 
CONSIDERED 


All the methods considered were tested on the Wang
database, available online at
(http://Wang.ist.psu.edu/docs/related.shtml).

FIG. 1 SOME IMAGES REPRESENTING ALL THE CLASSES OF THE WANG DATABASE


Note that Moments is a vector containing the three
low-order moments: mean, variance and skewness.

• The first fusion strategy: Re-ranking

Fig. 2 presents the results in terms of Precision-Recall
obtained with the re-ranking approach.

FIG. 2 THE RESULTS RETURNED WITH THE RE-RANKING STRATEGY OVER DIFFERENT COMBINATIONS

The scenarios given in Fig. 2 are respectively:

Moments_after_variance,
Variance_after_mean,
Variance_after_intersection,
Intersection_after_moments,
Mean_after_moments,
[…]_after_intersection.

The quality obtained with re-ranking can deteriorate if the
second method used within the re-ranking mechanism is worse
than the first. Mean_after_intersection and
Intersection_after_mean are the best combinations, and are
even better than the mean and the intersection used alone.

• The second fusion strategy: Utility concept

Fig. 3 presents the results in terms of Precision-Recall
obtained with the utility concept approach.

FIG. 3 THE RESULTS RETURNED WITH UTILITY CONCEPT STRATEGY OVER DIFFERENT COMBINATIONS

The scenarios given in Fig. 3 are respectively:

Mean_intersection, Mean_variance, Mean_moments,
Variance_intersection, Variance_moments,
Moments_intersection, Mean_moments_variance,
Mean_moments_intersection, Mean_variance_intersection,
Moments_variance_intersection,
Moments_mean_variance_intersection.

Fusion with this approach can also deteriorate the
performance of a primitive method if that method is of high
quality relative to the other primitive methods. Among all
the combinations tried, Mean_intersection is the best, and
it is also better than the mean and the intersection used
alone.

• The third fusion strategy: Equation

Fig. 4 presents the results in terms of Precision-Recall
obtained with the equation strategy.

FIG. 4 THE RESULTS RETURNED WITH EQUATION STRATEGY OVER DIFFERENT COMBINATIONS

The scenarios given in Fig. 4 are respectively:

Mean_intersection, Mean_variance, Mean_moments,
Variance_intersection, Variance_moments,
Moments_intersection, Mean_moments_variance,
Mean_moments_intersection, Mean_variance_intersection,
Moments_variance_intersection,
Moments_mean_variance_intersection.

Fusion with this approach can also deteriorate the
performance of a primitive method if that method is of high
quality relative to the other primitive methods. Among all
the combinations tried, Mean_intersection is the best, and
it is also better than the mean and the intersection used
alone.

• Comparison between the best cases over all fusion strategies

FIG. 5 COMPARISON BETWEEN THE BEST CASES OVER THE THREE FUSION STRATEGIES

The scenarios given in Fig. 5 are respectively:
Mean

Intersection

Inter_after_mean: the re-ranking results obtained by first
employing the mean signature and then the Histogram
Intersection with 27 fixed bins.

Mean_after_inter: the re-ranking results obtained by first
employing the Histogram Intersection method and then the
mean signature.

Mean_inter_Concept: the combination of both signatures,
mean and histogram, using the utility concept fusion
approach.

Mean_inter_formula: the combination of both signatures,
mean and histogram, using a single formula.

Conclusion 

In this paper, we dealt with feature fusion as a solution
for improving the results returned by a CBIR system. Three
statistical strategies were tested: re-ranking, the utility
concept and a single equation. Based on the results found,
we claim that combining features using a single formula is
the best way. To obtain better results, the signatures
combined should be of high quality. We do not assign any
weight in our formula; introducing weights could also be
effective.